Dataset statistics
| Number of variables | 24 |
|---|---|
| Number of observations | 150622 |
| Missing cells | 341 |
| Missing cells (%) | < 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 27.6 MiB |
| Average record size in memory | 192.0 B |
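The memory figures above are consistent with one another. The following is a minimal arithmetic sketch, assuming (this is not stated in the report) that each of the 24 columns stores 8 bytes per row and that index overhead is negligible.

```python
# Figures copied from the "Dataset statistics" table.
N_ROWS = 150_622
N_COLS = 24
BYTES_PER_VALUE = 8  # assumption: every column stores 8 bytes per row

record_bytes = N_COLS * BYTES_PER_VALUE
total_mib = N_ROWS * record_bytes / 2**20
column_mib = N_ROWS * BYTES_PER_VALUE / 2**20

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
print(f"per column:  {column_mib:.1f} MiB")
```

Under that assumption the sketch reproduces the reported 192.0 B per record, 27.6 MiB in total, and the 1.1 MiB memory size listed for each variable.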
Variable types
| BOOL | 15 |
|---|---|
| NUM | 8 |
| CAT | 1 |
Reproduction
| Analysis started | 2020-06-09 12:34:06.913424 |
|---|---|
| Analysis finished | 2020-06-09 12:34:29.803638 |
| Duration | 22.89 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
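As a quick consistency check, the reported duration can be recomputed from the two timestamps. This sketch uses only the Python standard library; the format string is an assumption about how the timestamps were rendered.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-06-09 12:34:06.913424", FMT)
finished = datetime.strptime("2020-06-09 12:34:29.803638", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```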
Warnings
| Warning | Type |
|---|---|
| admitdiagnosis has a high cardinality: 426 distinct values | High cardinality |
| patientunitstayid is highly correlated with df_index | High correlation |
| df_index is highly correlated with patientunitstayid | High correlation |
| df_index has unique values | Unique |
| patientunitstayid has unique values | Unique |
df_index
Real number (ℝ≥0)
| Distinct count | 150622 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 81908.17210633241 |
|---|---|
| Minimum | 0 |
| Maximum | 163656 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8087.05 |
| Q1 | 40954.25 |
| median | 81845.5 |
| Q3 | 122967.75 |
| 95-th percentile | 155618.95 |
| Maximum | 163656 |
| Range | 163656 |
| Interquartile range (IQR) | 82013.5 |
Descriptive statistics
| Standard deviation | 47272.76369 |
|---|---|
| Coefficient of variation (CV) | 0.5771434336 |
| Kurtosis | -1.201342145 |
| Mean | 81908.17211 |
| Median Absolute Deviation (MAD) | 41008.5 |
| Skewness | 0.001084073896 |
| Sum | 1.23371727e+10 |
| Variance | 2234714187 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2047 | 1 | < 0.1% | |
| 66826 | 1 | < 0.1% | |
| 113949 | 1 | < 0.1% | |
| 111900 | 1 | < 0.1% | |
| 99610 | 1 | < 0.1% | |
| 105753 | 1 | < 0.1% | |
| 103704 | 1 | < 0.1% | |
| 126231 | 1 | < 0.1% | |
| 124182 | 1 | < 0.1% | |
| 130325 | 1 | < 0.1% | |
| Other values (150612) | 150612 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 163656 | 1 | < 0.1% | |
| 163655 | 1 | < 0.1% | |
| 163653 | 1 | < 0.1% | |
| 163652 | 1 | < 0.1% | |
| 163651 | 1 | < 0.1% |
patientunitstayid
Real number (ℝ≥0)
| Distinct count | 150622 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1769139.3832839825 |
|---|---|
| Minimum | 141168 |
| Maximum | 3353254 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 141168 |
|---|---|
| 5-th percentile | 229562.65 |
| Q1 | 969206 |
| median | 1685783.5 |
| Q3 | 2750760.75 |
| 95-th percentile | 3206468.55 |
| Maximum | 3353254 |
| Range | 3212086 |
| Interquartile range (IQR) | 1781554.75 |
Descriptive statistics
| Standard deviation | 986592.8281 |
|---|---|
| Coefficient of variation (CV) | 0.5576682298 |
| Kurtosis | -1.308267745 |
| Mean | 1769139.383 |
| Median Absolute Deviation (MAD) | 904042.5 |
| Skewness | 0.02576842912 |
| Sum | 2.664713122e+11 |
| Variance | 9.733654085e+11 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2872418 | 1 | < 0.1% | |
| 2468827 | 1 | < 0.1% | |
| 3137052 | 1 | < 0.1% | |
| 1090878 | 1 | < 0.1% | |
| 3188823 | 1 | < 0.1% | |
| 2399545 | 1 | < 0.1% | |
| 1101107 | 1 | < 0.1% | |
| 1668259 | 1 | < 0.1% | |
| 1582382 | 1 | < 0.1% | |
| 2637101 | 1 | < 0.1% | |
| Other values (150612) | 150612 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 141168 | 1 | < 0.1% | |
| 141194 | 1 | < 0.1% | |
| 141197 | 1 | < 0.1% | |
| 141203 | 1 | < 0.1% | |
| 141208 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3353254 | 1 | < 0.1% | |
| 3353251 | 1 | < 0.1% | |
| 3353235 | 1 | < 0.1% | |
| 3353226 | 1 | < 0.1% | |
| 3353216 | 1 | < 0.1% |
verbal
Real number (ℝ)
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9633519671761097 |
|---|---|
| Minimum | -1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 5 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.634573332 |
|---|---|
| Coefficient of variation (CV) | 0.4124219461 |
| Kurtosis | 0.2951436799 |
| Mean | 3.963351967 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -1.330168977 |
| Sum | 596968 |
| Variance | 2.671829976 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 94986 | 63.1% | |
| 1 | 25774 | 17.1% | |
| 4 | 19164 | 12.7% | |
| 3 | 5094 | 3.4% | |
| 2 | 3310 | 2.2% | |
| -1 | 2294 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -1 | 2294 | 1.5% | |
| 1 | 25774 | 17.1% | |
| 2 | 3310 | 2.2% | |
| 3 | 5094 | 3.4% | |
| 4 | 19164 | 12.7% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 94986 | 63.1% | |
| 4 | 19164 | 12.7% | |
| 3 | 5094 | 3.4% | |
| 2 | 3310 | 2.2% | |
| 1 | 25774 | 17.1% |
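The descriptive statistics reported for verbal can be recomputed from its frequency table alone. The sketch below does so with the standard library. The helper `median_from_counts` is a hypothetical name introduced here for illustration, and the use of the sample variance (dividing by n - 1) is an assumption that happens to match the reported figure.

```python
import math

# Value -> count, copied from the frequency table for `verbal`.
counts = {5: 94986, 1: 25774, 4: 19164, 3: 5094, 2: 3310, -1: 2294}


def median_from_counts(freq):
    """Median of the data set described by a value -> count mapping."""
    n_items = sum(freq.values())
    # 0-based ranks of the middle element(s); they coincide when n_items is odd.
    lo, hi = (n_items - 1) // 2, n_items // 2
    picked, seen = [], 0
    for value in sorted(freq):
        seen += freq[value]
        while len(picked) < 2 and (lo, hi)[len(picked)] < seen:
            picked.append(value)
    return sum(picked) / 2


n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n
# Sample variance: squared deviations divided by n - 1 (assumed convention).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

median = median_from_counts(counts)
# Median absolute deviation: median of |value - median|, weighted by count.
deviations = {}
for v, c in counts.items():
    d = abs(v - median)
    deviations[d] = deviations.get(d, 0) + c
mad = median_from_counts(deviations)

print(n, total, mean, variance, std, cv, median, mad)
```

Because more than half of the observations equal 5, both the median and the quartile Q3 are 5 and the MAD collapses to 0, which is why the reported MAD says little about the spread of this variable.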
motor
Real number (ℝ)
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.415576741777429 |
|---|---|
| Minimum | -1 |
| Maximum | 6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 6 |
| median | 6 |
| Q3 | 6 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.447528396 |
|---|---|
| Coefficient of variation (CV) | 0.2672897948 |
| Kurtosis | 7.570351288 |
| Mean | 5.415576742 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -2.863280855 |
| Sum | 815705 |
| Variance | 2.095338458 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 6 | 118554 | 78.7% | |
| 5 | 12864 | 8.5% | |
| 1 | 7774 | 5.2% | |
| 4 | 7707 | 5.1% | |
| -1 | 2294 | 1.5% | |
| 3 | 895 | 0.6% | |
| 2 | 534 | 0.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -1 | 2294 | 1.5% | |
| 1 | 7774 | 5.2% | |
| 2 | 534 | 0.4% | |
| 3 | 895 | 0.6% | |
| 4 | 7707 | 5.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 6 | 118554 | 78.7% | |
| 5 | 12864 | 8.5% | |
| 4 | 7707 | 5.1% | |
| 3 | 895 | 0.6% | |
| 2 | 534 | 0.4% |
eyes
Real number (ℝ)
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.4370277914248915 |
|---|---|
| Minimum | -1 |
| Maximum | 4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 4 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.061591122 |
|---|---|
| Coefficient of variation (CV) | 0.3088689376 |
| Kurtosis | 4.084455152 |
| Mean | 3.437027791 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -2.102489131 |
| Sum | 517692 |
| Variance | 1.126975711 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 106418 | 70.7% | |
| 3 | 22482 | 14.9% | |
| 1 | 11988 | 8.0% | |
| 2 | 7440 | 4.9% | |
| -1 | 2294 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -1 | 2294 | 1.5% | |
| 1 | 11988 | 8.0% | |
| 2 | 7440 | 4.9% | |
| 3 | 22482 | 14.9% | |
| 4 | 106418 | 70.7% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 106418 | 70.7% | |
| 3 | 22482 | 14.9% | |
| 2 | 7440 | 4.9% | |
| 1 | 11988 | 8.0% | |
| -1 | 2294 | 1.5% |
admitdiagnosis
Categorical
| Distinct count | 426 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 341 |
| Missing (%) | 0.2% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| SEPSISPULM | 7526 | 5.0% | |
| AMI | 6263 | 4.2% | |
| CVASTROKE | 5800 | 3.9% | |
| CHF | 5548 | 3.7% | |
| SEPSISUTI | 4614 | 3.1% | |
| DKA | 4296 | 2.9% | |
| S-CABG | 4274 | 2.8% | |
| RHYTHATR | 3963 | 2.6% | |
| EMPHYSBRON | 3810 | 2.5% | |
| PNEUMBACT | 3430 | 2.3% | |
| Other values (416) | 100757 | 66.9% |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.102109918 |
| Min length | 3 |
thrombolytics
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 148105 | 98.3% | |
| 1 | 2517 | 1.7% |
aids
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 150448 | 99.9% | |
| 1 | 174 | 0.1% |
hepaticfailure
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 148137 | 98.4% | |
| 1 | 2485 | 1.6% |
lymphoma
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 149913 | 99.5% | |
| 1 | 709 | 0.5% |
metastaticcancer
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 147464 | 97.9% | |
| 1 | 3158 | 2.1% |
leukemia
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 149495 | 99.3% | |
| 1 | 1127 | 0.7% |
immunosuppression
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 146464 | 97.2% | |
| 1 | 4158 | 2.8% |
cirrhosis
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 147746 | 98.1% | |
| 1 | 2876 | 1.9% |
activetx
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 88220 | 58.6% | |
| 0 | 62402 | 41.4% |
ima
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 145861 | 96.8% | |
| 1 | 4761 | 3.2% |
midur
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 149205 | 99.1% | |
| 1 | 1417 | 0.9% |
ventday1
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 114051 | 75.7% | |
| 1 | 36571 | 24.3% |
oobventday1
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 100565 | 66.8% | |
| 1 | 50057 | 33.2% |
oobintubday1
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 111335 | 73.9% | |
| 1 | 39287 | 26.1% |
diabetes
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 114559 | 76.1% | |
| 1 | 36063 | 23.9% |
creatinine
Real number (ℝ)
| Distinct count | 1624 |
|---|---|
| Unique (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.053861972354636 |
|---|---|
| Minimum | -1.0 |
| Maximum | 24.95 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | 0.51 |
| median | 0.83 |
| Q3 | 1.38 |
| 95-th percentile | 4.18 |
| Maximum | 24.95 |
| Range | 25.95 |
| Interquartile range (IQR) | 0.87 |
Descriptive statistics
| Standard deviation | 1.856314306 |
|---|---|
| Coefficient of variation (CV) | 1.76143969 |
| Kurtosis | 17.70543987 |
| Mean | 1.053861972 |
| Median Absolute Deviation (MAD) | 0.43 |
| Skewness | 3.065964091 |
| Sum | 158734.798 |
| Variance | 3.445902804 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -1 | 29331 | 19.5% | |
| 0.8 | 3875 | 2.6% | |
| 0.7 | 3795 | 2.5% | |
| 0.9 | 3032 | 2.0% | |
| 0.6 | 2983 | 2.0% | |
| 1.1 | 2056 | 1.4% | |
| 1 | 1972 | 1.3% | |
| 0.5 | 1839 | 1.2% | |
| 1.2 | 1778 | 1.2% | |
| 1.3 | 1531 | 1.0% | |
| Other values (1614) | 98430 | 65.3% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -1 | 29331 | 19.5% | |
| 0.1 | 14 | < 0.1% | |
| 0.11 | 3 | < 0.1% | |
| 0.12 | 5 | < 0.1% | |
| 0.13 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 24.95 | 1 | < 0.1% | |
| 24.6 | 1 | < 0.1% | |
| 24.3 | 1 | < 0.1% | |
| 23.9 | 1 | < 0.1% | |
| 23.87 | 1 | < 0.1% |
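The value -1 accounts for 19.5% of the creatinine observations and sits below every other recorded value. If -1 is a placeholder for "not recorded" (an assumption; the report does not say so), the summary statistics above mix real measurements with placeholders. The sketch below uses only the reported sum and counts to show how much the mean shifts once those rows are excluded.

```python
# Figures copied from the creatinine section of the report.
N_OBS = 150_622
REPORTED_SUM = 158_734.798
SENTINEL = -1            # assumption: -1 encodes a missing measurement
SENTINEL_COUNT = 29_331

sentinel_share = SENTINEL_COUNT / N_OBS
mean_all = REPORTED_SUM / N_OBS

# Remove the sentinel rows from both the sum and the count.
valid_n = N_OBS - SENTINEL_COUNT
valid_sum = REPORTED_SUM - SENTINEL * SENTINEL_COUNT
mean_valid = valid_sum / valid_n

print(f"share of -1 values:      {sentinel_share:.1%}")
print(f"mean including -1:       {mean_all:.4f}")
print(f"mean excluding -1 rows:  {mean_valid:.4f}")
```

The same consideration would apply to the -1 values in verbal, motor, eyes and dischargelocation if they follow the same convention.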
dischargelocation
Real number (ℝ)
| Distinct count | 7 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.319189759796045 |
|---|---|
| Minimum | -1 |
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 4 |
| median | 4 |
| Q3 | 7 |
| 95-th percentile | 8 |
| Maximum | 9 |
| Range | 10 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.852318566 |
|---|---|
| Coefficient of variation (CV) | 0.3482332178 |
| Kurtosis | -1.07876761 |
| Mean | 5.31918976 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.7210437704 |
| Sum | 801187 |
| Variance | 3.431084071 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 96566 | 64.1% | |
| 8 | 30193 | 20.0% | |
| 7 | 13366 | 8.9% | |
| 9 | 6269 | 4.2% | |
| 6 | 3372 | 2.2% | |
| 5 | 670 | 0.4% | |
| -1 | 186 | 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| -1 | 186 | 0.1% | |
| 4 | 96566 | 64.1% | |
| 5 | 670 | 0.4% | |
| 6 | 3372 | 2.2% | |
| 7 | 13366 | 8.9% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9 | 6269 | 4.2% | |
| 8 | 30193 | 20.0% | |
| 7 | 13366 | 8.9% | |
| 6 | 3372 | 2.2% | |
| 5 | 670 | 0.4% |
visitnumber
Real number (ℝ≥0)
| Distinct count | 8 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0633307219396901 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.2851711407 |
|---|---|
| Coefficient of variation (CV) | 0.2681866843 |
| Kurtosis | 49.70515778 |
| Mean | 1.063330722 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.837860666 |
| Sum | 160161 |
| Variance | 0.08132257948 |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 142360 | 94.5% | |
| 2 | 7250 | 4.8% | |
| 3 | 824 | 0.5% | |
| 4 | 138 | 0.1% | |
| 5 | 32 | < 0.1% | |
| 6 | 11 | < 0.1% | |
| 7 | 5 | < 0.1% | |
| 8 | 2 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 142360 | 94.5% | |
| 2 | 7250 | 4.8% | |
| 3 | 824 | 0.5% | |
| 4 | 138 | 0.1% | |
| 5 | 32 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 2 | < 0.1% | |
| 7 | 5 | < 0.1% | |
| 6 | 11 | < 0.1% | |
| 5 | 32 | < 0.1% | |
| 4 | 138 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
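The definition above can be sketched in a few lines of standard-library Python. This is an illustrative implementation on made-up toy data, not the code the report itself runs. The function name `pearson_r` and the sample values are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/(n - 1) factors of the covariance and of the two standard
    # deviations cancel, so sums of centred products are sufficient.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Toy data (hypothetical, not taken from the profiled dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 7 for x in xs], [0.5 * y - 3 for y in ys])
# A negative scale factor flips the sign but keeps the magnitude.
r_flipped = pearson_r(xs, [-y for y in ys])

print(r, r_rescaled, r_flipped)
```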
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
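A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Ties receive the average of the ranks they span, which is a common convention assumed here. The helper names and the toy data are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: y = x ** 3 (toy data).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this toy input ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.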
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
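The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated above (often called τ-a) with a quadratic loop over all pairs. Tied pairs count as neither concordant nor discordant. Library implementations may apply a tie correction (τ-b) and can therefore differ on data with many ties. The function name and toy data are assumptions.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x2 - x1) * (y2 - y1)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0: tied in x or y, counted as neither
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data: 7 concordant and 3 discordant pairs out of 10.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 5]

tau = kendall_tau_a(xs, ys)
print(tau)
```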
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.
First rows
| | df_index | patientunitstayid | verbal | motor | eyes | admitdiagnosis | thrombolytics | aids | hepaticfailure | lymphoma | metastaticcancer | leukemia | immunosuppression | cirrhosis | activetx | ima | midur | ventday1 | oobventday1 | oobintubday1 | diabetes | creatinine | dischargelocation | visitnumber |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 141168 | 5 | 6 | 4 | RHYTHATR | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2.30 | 9 | 1 |
| 1 | 2 | 141194 | 4 | 6 | 3 | SEPSISUTI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2.51 | 4 | 1 |
| 2 | 3 | 141197 | 5 | 6 | 4 | SEPSISPULM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1.00 | 4 | 1 |
| 3 | 4 | 141203 | 1 | 3 | 1 | RESPARREST | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0.56 | 4 | 1 |
| 4 | 5 | 141208 | 5 | 6 | 3 | ODSEDHYP | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1.00 | 7 | 1 |
| 5 | 6 | 141227 | 4 | 6 | 3 | SEPSISPULM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1.90 | 6 | 1 |
| 6 | 7 | 141229 | 5 | 6 | 4 | CHF | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | -1.00 | 4 | 1 |
| 7 | 8 | 141233 | 5 | 6 | 4 | S-VALVMI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | -1.00 | 4 | 1 |
| 8 | 9 | 141244 | 5 | 6 | 4 | S-FEMPGRAF | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.65 | 4 | 1 |
| 9 | 10 | 141260 | 5 | 6 | 4 | ASTHMA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.04 | 4 | 1 |
Last rows
| | df_index | patientunitstayid | verbal | motor | eyes | admitdiagnosis | thrombolytics | aids | hepaticfailure | lymphoma | metastaticcancer | leukemia | immunosuppression | cirrhosis | activetx | ima | midur | ventday1 | oobventday1 | oobintubday1 | diabetes | creatinine | dischargelocation | visitnumber |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 150612 | 163646 | 3353197 | 3 | 6 | 4 | S-CABGAOV | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0.71 | 4 | 1 |
| 150613 | 163647 | 3353198 | 1 | 4 | 2 | COMA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1.01 | 4 | 4 |
| 150614 | 163648 | 3353200 | 5 | 6 | 4 | HYPOVOLEM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0.92 | 4 | 5 |
| 150615 | 163649 | 3353201 | 5 | 6 | 3 | PLEUREFFUS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | -1.00 | 4 | 3 |
| 150616 | 163650 | 3353213 | -1 | -1 | -1 | COMA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0.68 | 7 | 1 |
| 150617 | 163651 | 3353216 | 1 | 5 | 1 | S-CYSTOTH | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0.73 | 7 | 1 |
| 150618 | 163652 | 3353226 | -1 | -1 | -1 | PLEUREFFUS | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | -1.00 | 9 | 1 |
| 150619 | 163653 | 3353235 | 5 | 6 | 4 | CHF | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1.00 | 8 | 1 |
| 150620 | 163655 | 3353251 | 1 | 1 | 1 | CARDARREST | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 2.43 | 8 | 1 |
| 150621 | 163656 | 3353254 | 5 | 6 | 4 | LOWGIBLEED | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | -1.00 | 4 | 1 |